Policy Evaluation and Optimization with Continuous Treatments
Abstract
We study the problem of policy evaluation and learning from batched contextual bandit data when treatments are continuous, going beyond previous work on discrete treatments. Previous work for discrete treatment/action spaces focuses on inverse probability weighting (IPW) and doubly robust (DR) methods that use a rejection sampling approach for evaluation and the equivalent weighted classification problem for learning. In the continuous setting, this reduction fails as we would almost surely reject all observations. To tackle the case of continuous treatments, we extend the IPW and DR approaches to the continuous setting using a kernel function that leverages treatment proximity to attenuate discrete rejection. Our policy estimator is consistent and we characterize the optimal bandwidth. The resulting continuous policy optimizer (CPO) approach using our estimator achieves convergent regret and approaches the best-in-class policy for learnable policy classes. We demonstrate that the estimator performs well and, in particular, outperforms a discretization-based benchmark. We further study the performance of our policy optimizer in a case study on personalized dosing based on a dataset of Warfarin patients, their covariates, and final therapeutic doses. Our learned policy outperforms benchmarks and nears the oracle-best linear policy.
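As an illustration of the kernel idea described in the abstract, the following is a minimal sketch of a kernel-smoothed IPW value estimate. It is not taken from the paper: the Epanechnikov kernel, the function names, and the assumption that the logging density of the observed treatment given the covariates is known are all choices made here for illustration. Instead of keeping only observations whose logged treatment exactly equals the policy's treatment (which happens with probability zero for continuous treatments), each observation is weighted by how close its logged treatment is to the policy's choice, scaled by a bandwidth.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def kernel_ipw_value(policy_t, logged_t, outcomes, propensity_density, bandwidth):
    """Kernel-smoothed IPW estimate of a policy's mean outcome (illustrative).

    policy_t           : treatment the evaluated policy would assign to each unit
    logged_t           : treatment actually observed in the logged data
    outcomes           : observed outcome for each unit
    propensity_density : logging density of logged_t given covariates
                         (assumed known here; in practice it is often estimated)
    bandwidth          : kernel bandwidth h > 0 (a tuning parameter)
    """
    policy_t = np.asarray(policy_t, dtype=float)
    logged_t = np.asarray(logged_t, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    propensity_density = np.asarray(propensity_density, dtype=float)

    # Soft "acceptance" weight: large when the logged treatment is close to
    # the policy's treatment, zero when it is more than one bandwidth away.
    weights = epanechnikov((policy_t - logged_t) / bandwidth) / (
        bandwidth * propensity_density
    )
    return float(np.mean(weights * outcomes))


if __name__ == "__main__":
    # Synthetic example (hypothetical data): treatments logged uniformly on
    # [0, 1], so the logging density is 1 everywhere.
    rng = np.random.default_rng(0)
    n = 5000
    x = rng.uniform(0.0, 1.0, size=n)
    t = rng.uniform(0.0, 1.0, size=n)
    y = -(t - x) ** 2 + rng.normal(scale=0.1, size=n)
    estimate = kernel_ipw_value(policy_t=x, logged_t=t, outcomes=y,
                                propensity_density=np.ones(n), bandwidth=0.1)
    print(estimate)
```

The bandwidth trades bias against variance: a smaller value uses only very close treatments and yields a noisier estimate, while a larger value borrows from more distant treatments at the cost of bias. The abstract states that the paper characterizes the optimal bandwidth; the value used in the sketch is arbitrary.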
Related resources
Gradient-based Ant Colony Optimization for Continuous Spaces
A novel version of Ant Colony Optimization (ACO) algorithms for solving continuous space problems is presented in this paper. The basic structure and concepts of the originally reported ACO are preserved and adaptation of the algorithm to the case of continuous space is implemented within the general framework. The stigmergic communication is simulated through considering certain direction vect...
DISCRETE AND CONTINUOUS SIZING OPTIMIZATION OF LARGE-SCALE TRUSS STRUCTURES USING DE-MEDT ALGORITHM
Design optimization of structures with discrete and continuous search spaces is a complex optimization problem with lots of local optima. Metaheuristic optimization algorithms, due to not requiring gradient information of the objective function, are efficient tools for solving these problems at a reasonable computational time. In this paper, the Doppler Effect-Mean Euclidian Distance Threshold ...
Continuous Discrete Variable Optimization of Structures Using Approximation Methods
Optimum design of structures is achieved while the design variables are continuous and discrete. To reduce the computational work involved in the optimization process, all the functions that are expensive to evaluate, are approximated. To approximate these functions, a semi quadratic function is employed. Only the diagonal terms of the Hessian matrix are used and these elements are estimated fr...
FORCED WATER MAIN DESIGN MIXED ANT COLONY OPTIMIZATION
Most real world engineering design problems, such as cross-country water mains, include combinations of continuous, discrete, and binary value decision variables. Very often, the binary decision variables associate with the presence and/or absence of some nominated alternatives or project’s components. This study extends an existing continuous Ant Colony Optimization (ACO) algorithm to simultan...
Journal:
- CoRR
Volume: abs/1802.06037
Publication date: 2018